Abstract

Base station cooperative transmission, also known as coordinated multi-point (CoMP)
transmission, is a promising technique to improve spectrum efficiency in future cellular systems.
However, it requires large signalling overhead to gather the channel information. In this paper,
we consider low feedback user scheduling in downlink coherent CoMP systems by exploiting their
inherent channel asymmetry. Through the analysis of the statistics of the angle between channel
vectors and the tightness of a lower bound of the orthogonally projected norm, we show that the
channel norm provides sufficient information to judge the orthogonality among users in asymmetric
channels. Based on this observation, we propose a channel norm-based user scheduler (NUS), a
local channel aided NUS (LocalNUS) and a large-scale fading-based user scheduler (LUS).
Simulation results show that the LocalNUS performs very close to the existing greedy user
scheduler (GUS) and semi-orthogonal user scheduler (SUS) with full channel state information
while requiring much lower feedback overhead, the NUS performs close to or even outperforms the
GUS and the SUS when limited feedback is considered, and the LUS is robust to time-varying
channels.

I. Introduction

Inter-cell interference (ICI) is one of the major bottlenecks to providing high spectral efficiency
in universal frequency reuse cellular networks. In addition to various interference mitigation
techniques, the concept of cooperative base station (BS) transmission, also known as coordinated
multi-point (CoMP) transmission, has attracted much attention recently [1]-[3].

A typical centralized CoMP system consists of a control unit (CU) and multiple BSs connected 
to the CU via low-latency backhaul links, where the BSs can be either the distributed antenna 
heads or the BSs in different cells. As the most promising transmit strategy of CoMP, coherent 
cooperative transmission using multiuser (MU) precoding can convert ICI into desired signals, 
with which both the cell-average and the cell-edge throughput can be significantly improved [1], [4].
The centralized CoMP system is often regarded as a multiple-input multiple-output (MIMO)
system with a "super" BS in the literature. This allows existing precoding and user scheduling
methods designed for single-cell MU-MIMO systems to be applied. Nonetheless, there are
distinct differences between CoMP MU-MIMO and single-cell MU-MIMO systems, e.g., per-BS
power constraints (PBPC) [1], [3], asynchronous interference [3] and composite channels [5].

One fundamental difference between CoMP and single-cell systems lies in the channel char- 
acteristic. In single-cell systems the channels from multiple antennas at the BS to each user have 
the same average energy, whereas in CoMP systems the average channel energy from different 
BSs to each user differs. Such a feature is inherent in CoMP channels, which is named channel 
asymmetry in this paper. It has a large impact on the performance of precoding and scheduling. For
instance, the composite channel consisting of both large-scale and small-scale fading experienced
by the users in CoMP systems provides a multiuser diversity gain of $\sqrt{2\log K}$ rather than the
well-known $\log\log K$, where $K$ is the total number of users [5]. The change of channel feature
also provides opportunities to design novel transmission strategies, e.g., [6].

Channel-aware user scheduling is critical in CoMP MU-MIMO systems, where multiple users 
located in different cells may be selected and served concurrently by several coordinated BSs. 
When channel state information (CSI) is available at the CU, many scheduling algorithms such
as the greedy user scheduler (GUS) [7] and the semi-orthogonal user scheduler (SUS) [8] can be
applied. However, in the context of CoMP transmission, gathering full CSI leads to large signalling
overhead, which may counteract the performance gain.



Low feedback user scheduling has been studied extensively for single-cell MU-MIMO systems,
see [9]-[11] and references therein. In [9], the authors point out that both channel direction
information (CDI) and channel quality information (CQI) are essential to achieve the full multiuser
multiplexing and multiuser diversity gain. In [10], it is shown that the combination of the channel
norm with long-term channel statistics in the form of the channel mean and covariance matrix is
sufficient for both precoding and scheduling. Considering the fact that most of the feedback
overhead comes from the scheduling, a two-phase feedback strategy is proposed in [11], where
all users feed back rough channel information for scheduling in the first stage and only the
selected users feed back refined channel information for precoding in the second stage.

In this paper, we design low feedback schedulers by exploiting the channel asymmetry of 
CoMP systems. We show that such an asymmetric channel exhibits a special spatial correlation. 
We begin with analyzing the impact of the asymmetry on the orthogonality between users by 
characterizing the statistic of the angle between their channel vectors. With the derived probability 
density function (pdf) of the angle between two spatially correlated complex Gaussian vectors, 
we will show that the orthogonality between users is largely dependent on their locations. Based 
on the observation, we propose three channel norm-based user schedulers. 

For clarification, we call the BS that provides the maximum received power to each user as 
its local BS, and the other coordinated BSs as non-local BSs. We name the channel from each 
BS to a user as a sublink channel, and the channel from the local and non-local BS to the user 
as the local and cross channels, respectively. We will derive a lower bound of the orthogonally 
projected norm of one user's channel vector onto the subspace spanned by the channel vectors of 
the other selected users. By maximizing the lower bound, we develop a channel norm-based user 
scheduler (NUS), with which all users only feed back the instantaneous norms of their sublink 
channels. To tighten the bound and hence to improve the performance, we proceed to propose 
a local channel aided NUS (LocalNUS), with which the users feed back both the full local 
channel and the norms of the cross channels. In [12], [13], another channel norm-based scheduler
is proposed, where the users with the largest channel norms are selected. The proposed NUS
selects users with both high orthogonality and high signal-to-interference-plus-noise ratio
(SINR), which is very different from the schedulers in [12], [13].

The NUS and the LocalNUS are applicable for the systems with two-phase feedback strategy. 
In time-varying channels, the resulting scheduling delay may lead to worse orthogonality between
the users scheduled with the outdated channels in the first stage. To address this issue, we propose
a large-scale fading gain based user scheduler (LUS), which is based on the fact that the channel 
norm is dominated by the large-scale fading in CoMP systems. Since the large-scale fading gain 
varies slowly, the LUS can be quasi-statically performed over a relatively long period. 

Limited feedback schemes are often applied for providing CSI to the BS over a capacity-
constrained feedback link [14]. The number of feedback bits per user is limited not only by the
capacity of its own feedback link [15], but also by the capacity of the overall feedback link
shared by all users [16]. Since the proposed schedulers need very low feedback from all users in
the first stage and only the selected users feed back the quantized channels for precoding in the 
second stage, more precise CSI can be provided for precoding than with the GUS and the SUS, 
which are usually applied in the systems with one-phase feedback strategy. Simulation results 
demonstrate the performance gain of the proposed schedulers over the GUS and the SUS. 

The rest of the paper is organized as follows. The system model is described in Section II.
The asymmetric feature of CoMP channels and its impact on the orthogonality between users 
are investigated in Section III. Three low feedback user schedulers are proposed in Section 
IV, and their feedback requirement is analyzed in Section V. The performance of the proposed 
schedulers is compared with existing schedulers using Monte-Carlo simulation in Section VI, 
and concluding remarks are given in Section VII. 

Throughout this paper, boldface upper and lower case letters denote matrices and row vectors,
and standard lower case letters denote scalars. The superscripts $(\cdot)^T$ and $(\cdot)^H$ denote the transpose
and the conjugate transpose of a vector or a matrix, respectively. $\Re\{\cdot\}$ and $\Im\{\cdot\}$ denote the real
and imaginary parts, and $E\{\cdot\}$ denotes the expectation operator. $\mathbf{A}^{1/2}$ and $|\mathbf{A}|$ respectively denote
the Hermitian square-root and the determinant of matrix $\mathbf{A}$. $\|\mathbf{a}\|$ denotes the Euclidean norm of
vector $\mathbf{a}$, and $\mathrm{diag}\{\mathbf{a}\}$ represents a diagonal matrix with the elements of $\mathbf{a}$. Finally, $\mathbf{I}$ denotes
the identity matrix, and $\mathbf{0}$ denotes the vector of zeros.

II. System Model 

Consider a frequency division duplex (FDD) downlink centralized CoMP system, as shown
in Fig. 1. Through downlink training, both the local and cross channels are available at the user.
Then the quantized channel is fed back from each user to its local BS, and all coordinated
BSs forward the quantized CSI from their local users to the CU through low-latency backhaul links.
Then the CU selects users, computes the transmit precoding for the scheduled users, and sends the
scheduling results and the precoding weighting vectors to all coordinated BSs for transmission.

Let $(M, N_t, K)$ denote the network layout of the CoMP system consisting of $M$ coordinated
cells, each of which includes one BS equipped with $N_t$ antennas and $K$ single-antenna users. Let
$i_{km}$ denote the index of the $k$th user located in the $m$th cell, $i_{km} = K(m-1) + k$, $m = 1, \ldots, M$,
$k = 1, \ldots, K$. The global channel vector of user $i_{km}$ is denoted as
$\mathbf{h}_{i_{km}} = [\mathbf{h}_{i_{km}1}, \ldots, \mathbf{h}_{i_{km}M}] \in \mathbb{C}^{1 \times MN_t}$, where
$\mathbf{h}_{i_{km}n} = \sqrt{\alpha_{i_{km}n}}\,\bar{\mathbf{h}}_{i_{km}n} \in \mathbb{C}^{1 \times N_t}$ is the sublink channel vector from the $n$th BS
to user $i_{km}$, $\alpha_{i_{km}n}$ denotes the large-scale fading gain including pathloss and shadowing, and
$\bar{\mathbf{h}}_{i_{km}n} \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_{i_{km}n})$ is the small-scale fading channel vector following a complex Gaussian
distribution, $n = 1, \ldots, M$.

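For linear precoding, at most $MN_t$ users can be served simultaneously. Let $\mathcal{T} = \{i_{11}, \ldots, i_{KM}\}$
denote the total user pool, and $\mathcal{S} = \{s_1, \ldots, s_L\}$ denote the set of indices of $L$ scheduled users
that is a subset of $\mathcal{T}$, i.e., $\mathcal{S} \subseteq \mathcal{T}$. Then the signal received by user $s_l$ is

$$y_{s_l} = \mathbf{h}_{s_l}\mathbf{W}\mathbf{x} + z_{s_l}, \qquad (1)$$

where $\mathbf{x} \in \mathbb{C}^{L \times 1}$ contains the data symbols for the users in $\mathcal{S}$, $z_{s_l}$ denotes the receiver noise and
the interference from non-cooperative cells at user $s_l$, which is modeled as additive white
Gaussian noise (AWGN) with zero mean and variance $\sigma_{s_l}^2 = \sum_{m \in \mathcal{X}} P_m \alpha_{s_l m} + \sigma_n^2$, $\mathcal{X}$ is the set
of indices of non-cooperative cells, $P_m$ is the maximal transmit power of the $m$th BS, and
$\mathbf{W} = [\mathbf{w}_{s_1}^T, \ldots, \mathbf{w}_{s_L}^T] \in \mathbb{C}^{MN_t \times L}$ is a linear precoding matrix for all scheduled users. The SINR
of user $s_l$ is

$$\gamma_{s_l} = \frac{|\mathbf{h}_{s_l}\mathbf{w}_{s_l}^T|^2}{\sum_{j \neq l} |\mathbf{h}_{s_l}\mathbf{w}_{s_j}^T|^2 + \sigma_{s_l}^2}, \qquad (2)$$

where $\sum_{j \neq l} |\mathbf{h}_{s_l}\mathbf{w}_{s_j}^T|^2$ is the inter-user interference.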

When perfect CSI is available, the inter-user interference can be eliminated by using zero-
forcing (ZF) transmission with the precoder [1]

$$\mathbf{W} = \mathbf{G}\mathbf{P}^{1/2}, \qquad (3)$$

where $\mathbf{G} = \mathbf{H}_{\mathcal{S}}^H (\mathbf{H}_{\mathcal{S}}\mathbf{H}_{\mathcal{S}}^H)^{-1}$ is the ZF beamformer (ZFBF), $\mathbf{H}_{\mathcal{S}} = [\mathbf{h}_{s_1}^T, \ldots, \mathbf{h}_{s_L}^T]^T$, and $\mathbf{P}$ is
a diagonal power allocation matrix. Combined with the low-complexity scheduler SUS or GUS
[7], [8], the ZFBF achieves a sum rate that asymptotically grows with the number of users in the
same way as dirty paper coding under sum power constraints [8], [17].
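
As a minimal sketch of the ZF beamformer in (3), the following pure-Python example builds $\mathbf{G} = \mathbf{H}_{\mathcal{S}}^H (\mathbf{H}_{\mathcal{S}}\mathbf{H}_{\mathcal{S}}^H)^{-1}$ for the special case of two scheduled users, where the 2 x 2 Gram matrix can be inverted in closed form. The function names and the numerical channel values are illustrative assumptions, not taken from the paper, and power allocation is omitted.

```python
def inner(a, b):
    """Return a b^H for two row vectors given as lists of complex numbers."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def zf_beamformer_two_users(h1, h2):
    """ZF beamformer G = H^H (H H^H)^{-1} for two users with row channels h1, h2."""
    users = [h1, h2]
    # Gram matrix H H^H (2 x 2) and its closed-form inverse.
    a, b = inner(h1, h1), inner(h1, h2)
    c, d = inner(h2, h1), inner(h2, h2)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # G has one row per transmit antenna and one column per user.
    return [
        [sum(users[i][n].conjugate() * inv[i][j] for i in range(2)) for j in range(2)]
        for n in range(len(h1))
    ]


def effective_channel(h1, h2, g):
    """Return the 2 x 2 product H G, which is the identity under ZF."""
    return [
        [sum(h[n] * g[n][j] for n in range(len(h))) for j in range(2)]
        for h in (h1, h2)
    ]


# Hypothetical global channels of two users over three transmit antennas.
h1 = [1 + 1j, 0.5 + 0j, 0.2j]
h2 = [0.3 + 0j, 1 - 0.5j, 0.1 + 0j]
G = zf_beamformer_two_users(h1, h2)
HG = effective_channel(h1, h2, G)
```

The off-diagonal entries of `HG` vanish up to rounding, which is the statement that ZF removes the inter-user interference term in (2).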



III. Channel Asymmetry in BS Cooperative Transmission Systems 

In this section, we analyze the impact of the inherent asymmetric feature of CoMP channels on 
the orthogonality between users by deriving the pdf of the angle between their channel vectors. 

A. The Angle Between Channel Vectors of Users 

Consider two users with the global channel vectors $\mathbf{h}_1$ and $\mathbf{h}_2$. Assume that the sublink small-
scale fading channels are modeled as

$$\bar{\mathbf{h}}_{im} = \mathbf{g}_{im}\mathbf{R}_{im}^{1/2}, \quad i = 1, 2, \quad m = 1, \ldots, M, \qquad (4)$$

where $\mathbf{g}_{im} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, and $\mathbf{R}_{im}$ is the spatial correlation matrix of the sublink channel. Then
we can express the $i$th user's global channel vector as

$$\mathbf{h}_i = \mathbf{g}_i\mathbf{R}_i^{1/2}, \qquad (5)$$

where $\mathbf{R}_i = \mathrm{diag}\{[\alpha_{i1}\mathbf{R}_{i1}, \ldots, \alpha_{iM}\mathbf{R}_{iM}]\}$, and $\mathbf{g}_i = [\mathbf{g}_{i1}, \ldots, \mathbf{g}_{iM}]$. This shows that CoMP
channels are a special kind of spatially correlated channels, whose transmit correlation matrix
depends on the sublink channel correlation matrices as well as the large-scale fading gains.
The angle between the two users' global channel vectors is expressed as

$$\cos^2\theta = \frac{|\mathbf{h}_2\mathbf{h}_1^H|^2}{\|\mathbf{h}_2\|^2\|\mathbf{h}_1\|^2} = \frac{|\mathbf{h}_2\mathbf{v}_1^H|^2}{\|\mathbf{h}_2\|^2}, \qquad (6)$$

where $\mathbf{v}_1 = \mathbf{h}_1/\|\mathbf{h}_1\|$. When the transmit correlation matrix is a scaled identity matrix (i.e.,
$\mathbf{R}_i = c\mathbf{I}$) for $i \in \{1, 2\}$, $\cos^2\theta$ has been shown to follow a beta distribution with parameters 1
and $N - 1$ [18], where $N = MN_t$. Since CoMP channels are asymmetric, i.e., different sublink
channels have different large-scale fading gains, $\mathbf{R}_i$ is no longer a scaled identity matrix.

To obtain the pdf of $\cos^2\theta$, we define $q_n = |\mathbf{h}_2\mathbf{v}_n^H|^2$ and $\mathbf{q} = [q_1, \ldots, q_N]$, where
$\mathbf{V} = [\mathbf{v}_1^T, \mathbf{v}_2^T, \ldots, \mathbf{v}_N^T]^T$ is a standard orthogonal basis generated from $\mathbf{v}_1$. Then we can rewrite (6) as

$$\cos^2\theta = \frac{q_1}{\sum_{n=1}^{N} q_n}. \qquad (7)$$

The joint pdf of $\mathbf{q}$, $f_{\mathbf{q}}(\mathbf{x})$, is derived in Appendix A. By a change of variables, the pdf of
$\cos^2\theta$ can be obtained from $f_{\mathbf{q}}(\mathbf{x})$.

In the following, we will use numerical results to analyze the impact of the asymmetric channel
on the orthogonality between users.



B. Numerical Analysis 

To highlight the impact of channel asymmetry on the orthogonality, here we take a simple but
fundamental CoMP system consisting of two single-antenna BSs as an example. In this case,
the channel correlation matrix of the $i$th user is $\mathbf{R}_i = \mathrm{diag}\{[\alpha_{i1}, \alpha_{i2}]\}$, which depends on the
user's location, $i = 1, 2$. We consider two BSs located at the positions $-d$ and $d$, and two users
located at $d_1$ and $d_2$, as shown in Fig. 2, where $d = 250$ m. We only consider the pathloss, which is set as
$\alpha_{im} = -35.3 - 37.6\log_{10}(d_{im})$ dB, where $d_{im}$ is the distance from the $m$th BS to the $i$th user.

Figure 3 shows the pdf of $\cos^2\theta$ with respect to different user locations $(d_1, d_2)$. When the users
are located at the boundary of the two cells, i.e., the case of $(0, 0)$, $\cos^2\theta$ is uniformly distributed
between 0 and 1. This agrees with the results in [18], since in this specific scenario the CoMP
channel degenerates to a traditional single-cell channel with a two-antenna BS and the beta
distribution reduces to the uniform distribution. When both users are in the same cell, such as
the cases of $(50, 100)$ and $(100, 100)$, $\cos^2\theta$ takes large values with a high probability. By contrast,
when the two users are located in different cells, such as the cases of $(-50, 100)$ and $(-100, 100)$,
$\cos^2\theta$ takes small values with a high probability, i.e., the two users are more orthogonal. As in
single-cell MIMO systems [18], when each BS is equipped with more antennas, the phenomenon
is the same except that the users will be orthogonal with a higher probability.
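
The trend described above can be checked with a small Monte Carlo sketch of the two-BS, single-antenna setup. The code below is an illustration under stated assumptions: it uses the pathloss model and geometry of this subsection, estimates only the mean of $\cos^2\theta$ rather than its full pdf, and all function names, the number of trials and the seed are hypothetical choices.

```python
import math
import random


def pathloss_gain(distance_m):
    """Linear large-scale gain from the model -35.3 - 37.6 log10(d) dB."""
    return 10 ** ((-35.3 - 37.6 * math.log10(distance_m)) / 10)


def complex_gaussian(rng):
    """Draw one CN(0, 1) sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0, s), rng.gauss(0, s))


def mean_cos2(d1, d2, d=250.0, trials=20000, seed=0):
    """Monte Carlo estimate of E[cos^2(theta)] for users located at d1 and d2."""
    rng = random.Random(seed)
    bs_positions = (-d, d)
    total = 0.0
    for _ in range(trials):
        h = []
        for pos in (d1, d2):
            # Sublink channel = sqrt(large-scale gain) * small-scale fading.
            h.append([math.sqrt(pathloss_gain(abs(pos - bs))) * complex_gaussian(rng)
                      for bs in bs_positions])
        num = abs(sum(x * y.conjugate() for x, y in zip(h[0], h[1]))) ** 2
        den = sum(abs(x) ** 2 for x in h[0]) * sum(abs(y) ** 2 for y in h[1])
        total += num / den
    return total / trials


edge = mean_cos2(0.0, 0.0)             # both users on the cell boundary
same_cell = mean_cos2(100.0, 100.0)    # both users in the same cell
diff_cell = mean_cos2(-100.0, 100.0)   # users in different cells
```

With users on the boundary the estimate should be close to 0.5, the mean of a uniform variable, while same-cell users give a larger mean and different-cell users a smaller one.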

IV. Low Feedback User Schedulers

We have shown that in CoMP channels user locations have a large impact on the orthogonality
between users. When the users are in different cells, the orthogonality depends on the large- 
scale fading gains, and in fact can be judged by channel norms as will be shown in this 
section. Otherwise, sublink small-scale channels are necessary to judge the orthogonality. This 
suggests the possibility of designing a low feedback scheduler to select the users with sufficient 
orthogonality based on user locations and channel norms. In the following, we derive several 
channel norm-based schedulers according to the observation. To exploit the orthogonality of 
users, we consider schedulers bearing the spirit of the SUS. 

A. Channel Norm-based Scheduler (NUS) 

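Two critical factors of the SUS are the orthogonally projected norm and the orthogonality
threshold [8].

In the $(l+1)$th iteration of the SUS, we need to compute the norm of the projection of
the $i_{km}$th user's channel vector $\mathbf{h}_{i_{km}}$ onto the orthogonal complement of the subspace spanned by
the selected users' channel vectors $\mathbf{H}_{\mathcal{S}_l}$ as

$$\nu_{\mathcal{S}_l i_{km}} = \mathbf{h}_{i_{km}}\mathbf{Q}_{\mathcal{S}_l}\mathbf{h}_{i_{km}}^H, \qquad (9)$$

where $\mathbf{Q}_{\mathcal{S}_l} = \mathbf{I} - \mathbf{H}_{\mathcal{S}_l}^H(\mathbf{H}_{\mathcal{S}_l}\mathbf{H}_{\mathcal{S}_l}^H)^{-1}\mathbf{H}_{\mathcal{S}_l}$ is the orthogonal projection matrix, $\mathbf{H}_{\mathcal{S}_l} = [\mathbf{h}_{s_1}^T, \ldots, \mathbf{h}_{s_l}^T]^T$,
$\mathbf{h}_{s_i} = [\mathbf{h}_{s_i 1}, \ldots, \mathbf{h}_{s_i M}]$, $1 \le i \le l$, $i_{km} \in \mathcal{T}_{l+1}$, $\mathcal{T}_{l+1}$ is the user pool in the $(l+1)$th iteration, and
$\mathcal{S}_l = [s_1, \ldots, s_l]$ is the scheduling result before the $(l+1)$th iteration.

To obtain the orthogonally projected norm, the SUS needs full CSI from all users. In order to
derive a scheduler based on channel norms, we derive a lower bound of $\nu_{\mathcal{S}_l i_{km}}$ that only depends
on the norms of the sublink channels.

Define $\mathbf{S} = \mathrm{diag}\{[\|\mathbf{h}_{s_1}\|^2, \ldots, \|\mathbf{h}_{s_l}\|^2]\}$, and $\bar{\mathbf{S}} = \mathbf{H}_{\mathcal{S}_l}\mathbf{H}_{\mathcal{S}_l}^H - \mathbf{S}$. Then by using the matrix
inversion lemma [19, App. A], we have

$$(\mathbf{H}_{\mathcal{S}_l}\mathbf{H}_{\mathcal{S}_l}^H)^{-1} = \mathbf{S}^{-1} - \mathbf{S}^{-1}(\mathbf{S}^{-1} + \bar{\mathbf{S}}^{-1})^{-1}\mathbf{S}^{-1} \triangleq \mathbf{S}^{-1} - \mathbf{A}. \qquad (10)$$

Substituting (10) into (9), we have

$$\begin{aligned}
\nu_{\mathcal{S}_l i_{km}} &= \mathbf{h}_{i_{km}}\left(\mathbf{I} - \mathbf{H}_{\mathcal{S}_l}^H(\mathbf{S}^{-1} - \mathbf{A})\mathbf{H}_{\mathcal{S}_l}\right)\mathbf{h}_{i_{km}}^H \\
&= \mathbf{h}_{i_{km}}\left(\mathbf{I} - \mathbf{H}_{\mathcal{S}_l}^H\mathbf{S}^{-1}\mathbf{H}_{\mathcal{S}_l}\right)\mathbf{h}_{i_{km}}^H + \mathbf{h}_{i_{km}}\mathbf{H}_{\mathcal{S}_l}^H\mathbf{A}\mathbf{H}_{\mathcal{S}_l}\mathbf{h}_{i_{km}}^H \\
&\overset{(a)}{\ge} \mathbf{h}_{i_{km}}\left(\mathbf{I} - \mathbf{H}_{\mathcal{S}_l}^H\mathbf{S}^{-1}\mathbf{H}_{\mathcal{S}_l}\right)\mathbf{h}_{i_{km}}^H = \|\mathbf{h}_{i_{km}}\|^2\left(1 - \sum_{j=1}^{l}\cos^2\theta_{i_{km}s_j}\right),
\end{aligned} \qquad (11)$$

where

$$\cos\theta_{i_{km}s_j} = \frac{|\mathbf{h}_{i_{km}}\mathbf{h}_{s_j}^H|}{\|\mathbf{h}_{i_{km}}\|\|\mathbf{h}_{s_j}\|} \qquad (12)$$

is the cosine of the angle between user $i_{km}$ and user $s_j$, $0 \le \theta_{i_{km}s_j} \le \pi/2$, i.e., $\cos\theta_{i_{km}s_j} \ge 0$.
Considering the property of the inner product of vectors, $\cos\theta_{i_{km}s_j}$ is upper bounded by

$$\cos\theta_{i_{km}s_j} = \frac{\left|\sum_{n=1}^{M}\mathbf{h}_{i_{km}n}\mathbf{h}_{s_jn}^H\right|}{\sqrt{\sum_{n=1}^{M}\|\mathbf{h}_{i_{km}n}\|^2}\sqrt{\sum_{n=1}^{M}\|\mathbf{h}_{s_jn}\|^2}} \overset{(b)}{\le} \frac{\sum_{n=1}^{M}\|\mathbf{h}_{i_{km}n}\|\|\mathbf{h}_{s_jn}\|}{\sqrt{\sum_{n=1}^{M}\|\mathbf{h}_{i_{km}n}\|^2}\sqrt{\sum_{n=1}^{M}\|\mathbf{h}_{s_jn}\|^2}} \triangleq \mu_{i_{km}s_j}, \qquad (13)$$

then we get the lower bound of the orthogonally projected norm as

$$\nu_{\mathcal{S}_l i_{km}} \overset{(c)}{\ge} \|\mathbf{h}_{i_{km}}\|^2\left(1 - \sum_{j=1}^{l}\mu_{i_{km}s_j}^2\right) \triangleq \nu^{lb}_{\mathcal{S}_l i_{km}}. \qquad (14)$$
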

One can see that $\nu^{lb}_{\mathcal{S}_l i_{km}}$ only depends on the norms of the sublink channels from cooperative
BSs. Assuming homogeneous users, the SUS selects users merely based on $\nu_{\mathcal{S}_l i_{km}}$ [8]. Since in
CoMP systems we need to consider heterogeneous users which experience different interference
from non-cooperative BSs, we select users based on the term $\nu^{lb}_{\mathcal{S}_l i_{km}}/\sigma_{i_{km}}^2$, which reflects the
receive signal-to-noise ratio (SNR) with ZFBF.
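
To make the two bounds concrete, the sketch below computes the upper bound on $\cos\theta$ in (13) and the lower bound on the projected norm in (14) from sublink norms only, and compares them with the exact values obtained from full channel vectors. It is an illustration under assumptions: the helper names and the example channels (two BSs with two antennas each, one user local to each BS) are hypothetical.

```python
import math


def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def cos_theta(h_i, h_j):
    """Exact cosine in (12); each channel is a list of per-BS sublink vectors."""
    flat_i = [x for sub in h_i for x in sub]
    flat_j = [x for sub in h_j for x in sub]
    ip = sum(x * y.conjugate() for x, y in zip(flat_i, flat_j))
    return abs(ip) / (norm(flat_i) * norm(flat_j))


def mu_bound(norms_i, norms_j):
    """Upper bound (13), computed from the sublink norms only."""
    num = sum(a * b for a, b in zip(norms_i, norms_j))
    den = math.sqrt(sum(a * a for a in norms_i)) * math.sqrt(sum(b * b for b in norms_j))
    return num / den


def nu_lower_bound(norms_i, selected_norms):
    """Lower bound (14) on the orthogonally projected norm."""
    penalty = sum(mu_bound(norms_i, n_j) ** 2 for n_j in selected_norms)
    return sum(a * a for a in norms_i) * (1.0 - penalty)


# Hypothetical example: user i is local to BS 1, user j is local to BS 2.
h_i = [[1.0 + 0.5j, -0.8j], [0.05 + 0j, 0.02j]]
h_j = [[0.03j, 0.04 + 0j], [0.9 + 0j, 0.6 - 0.3j]]
norms_i = [norm(sub) for sub in h_i]
norms_j = [norm(sub) for sub in h_j]

mu_ij = mu_bound(norms_i, norms_j)
nu_lb = nu_lower_bound(norms_i, [norms_j])
# With a single selected user, the exact projected norm is ||h_i||^2 (1 - cos^2).
nu_exact = sum(a * a for a in norms_i) * (1.0 - cos_theta(h_i, h_j) ** 2)
```

For users in different cells both `mu_ij` and the gap between `nu_exact` and `nu_lb` are small, in line with the tightness discussion in this subsection.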

As a successive user scheduler, the SUS terminates the scheduling procedure by controlling
the orthogonality between the selected users. Analogously, we introduce a threshold $\epsilon$ to ensure
$\mu_{i_{km}s_j} \le \epsilon$, i.e., we restrict the upper bound of $\cos\theta_{i_{km}s_j}$. This is equivalent to introducing a
constraint on the lower bound of the angle between the channel vectors of the two users.

Now we consider the tightness of the lower bound $\nu^{lb}_{\mathcal{S}_l i_{km}}$ and the upper bound $\mu_{i_{km}s_j}$ by
analyzing the tightness of the three inequalities (a), (b) and (c) in (11), (13) and (14).

The equality of (a) holds when $\mathbf{A} = \mathbf{0}$. This implies that the users in $\mathcal{S}_l$ are orthogonal among
each other, since in this case $\mathbf{H}_{\mathcal{S}_l}\mathbf{H}_{\mathcal{S}_l}^H = \mathbf{S}$ is diagonal according to (10). In the proposed NUS,
the scheduled users are semi-orthogonal, which is ensured by judiciously choosing the threshold
$\epsilon$. Thereby the gap between the two terms on the left and right hand sides of inequality (a) is small.

By examining inequality (b), we will see that the tightness of $\mu_{i_{km}s_j}$ depends on the users'
locations. In the first case, when user $i_{km}$ and the selected users in $\mathcal{S}_l$ are in different cells,
$\mu_{i_{km}s_j}$ is tight. To see this, noting that the signals from non-local BSs experience much stronger
attenuation, at least one of the two terms, $\|\mathbf{h}_{i_{km}n}\|$ and $\|\mathbf{h}_{s_jn}\|$, will be very small, and hence
$\mu_{i_{km}s_j}$ is small. Considering inequality (b) and the fact that $\cos\theta_{i_{km}s_j} \ge 0$, we have
$0 \le \cos\theta_{i_{km}s_j} \le \mu_{i_{km}s_j}$, which means that $\mu_{i_{km}s_j}$ is a tight upper bound of $\cos\theta_{i_{km}s_j}$. Further
considering inequality (c), we know that $\nu^{lb}_{\mathcal{S}_l i_{km}}$ is also a tight lower bound. In the second
case, when user $i_{km}$ and user $s_j$ are in the $m$th cell, both $\|\mathbf{h}_{i_{km}n}\|$ and $\|\mathbf{h}_{s_jn}\|$ will be very
small for $n \ne m$. Then we can ignore all other $M - 1$ terms in the sum of the numerator in
(13) except for the term with $n = m$, and the tightness of $\mu_{i_{km}s_j}$ can be approximately reflected
by $\|\mathbf{h}_{i_{km}m}\|\|\mathbf{h}_{s_jm}\| - |\mathbf{h}_{i_{km}m}\mathbf{h}_{s_jm}^H|$. It is zero if every BS has only one antenna, in which case $\mu_{i_{km}s_j}$
is tight. Otherwise, $\mu_{i_{km}s_j}$ is not tight since $|\mathbf{h}_{i_{km}m}\mathbf{h}_{s_jm}^H|$ may be zero when the small-scale
fading channels are orthogonal, and thereby $\nu^{lb}_{\mathcal{S}_l i_{km}}$ is not tight either. However, we know from
the numerical results in Fig. 3 that the angle between two users in the same cell is small with
a high probability. Therefore, $\mu_{i_{km}s_j}$ is often too large to satisfy the orthogonality constraint.
This implies that the NUS will select same-cell users with a low probability. We will show the
tightness of the lower bound through numerical analysis in Section VI.

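Let $\mathcal{T}_l$ and $\mathcal{S}_l$ denote the user pool and the scheduling result at the $l$th step, and set
$\mathcal{T}_0 = \{1, 2, \ldots, MK\}$, then the NUS is summarized as follows.

1) Initialize by selecting a user with the maximum SNR as the first user,

$$s_1 = \arg\max_{i_{km} \in \mathcal{T}_0} \frac{\|\mathbf{h}_{i_{km}}\|^2}{\sigma_{i_{km}}^2}. \qquad (15)$$

Set $\mathcal{S}_1 = \{s_1\}$ and $l = 1$.
2) When $l < \min(MN_t, MK)$, obtain the user pool $\mathcal{T}_l$ as

$$\mathcal{T}_l = \{i_{km} \in \mathcal{T}_{l-1},\ i_{km} \notin \mathcal{S}_l \mid \mu_{i_{km}s_l} \le \epsilon\}. \qquad (16)$$

If $\mathcal{T}_l = \emptyset$ (empty set), the iteration will stop. Otherwise, compute the lower bound $\nu^{lb}_{\mathcal{S}_l i_{km}}$
of the orthogonally projected norm and select the user with the largest $\nu^{lb}_{\mathcal{S}_l i_{km}}/\sigma_{i_{km}}^2$,

$$s_{l+1} = \arg\max_{i_{km} \in \mathcal{T}_l} \frac{\nu^{lb}_{\mathcal{S}_l i_{km}}}{\sigma_{i_{km}}^2}. \qquad (17)$$

Set $\mathcal{S}_{l+1} = \mathcal{S}_l \cup \{s_{l+1}\}$ and $l = l + 1$, where $\cup$ denotes the union of two sets.
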
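
The two steps above can be sketched as a short routine that operates on the fed-back sublink norms only. This is a minimal illustration, not the authors' implementation: the function names, the zero-based user indices and the example values are assumptions, and `max_users` stands for $\min(MN_t, MK)$.

```python
import math


def mu_bound(norms_i, norms_j):
    """Upper bound (13) on cos(theta), from per-BS sublink norms."""
    num = sum(a * b for a, b in zip(norms_i, norms_j))
    den = math.sqrt(sum(a * a for a in norms_i)) * math.sqrt(sum(b * b for b in norms_j))
    return num / den


def nus(norms, noise_var, eps, max_users):
    """Channel norm-based user scheduler following steps 1) and 2).

    norms[i][n] is the norm of the sublink channel from BS n to user i,
    noise_var[i] is the noise-plus-interference variance of user i.
    """
    sq_norm = [sum(a * a for a in row) for row in norms]
    users = range(len(norms))
    # Step 1: pick the user with the maximum SNR, as in (15).
    first = max(users, key=lambda i: sq_norm[i] / noise_var[i])
    selected = [first]
    pool = [i for i in users if i != first]
    # Step 2: shrink the pool with (16), then select with (17).
    while len(selected) < max_users:
        last = selected[-1]
        pool = [i for i in pool if mu_bound(norms[i], norms[last]) <= eps]
        if not pool:
            break

        def metric(i):
            penalty = sum(mu_bound(norms[i], norms[s]) ** 2 for s in selected)
            return sq_norm[i] * (1.0 - penalty) / noise_var[i]

        nxt = max(pool, key=metric)
        selected.append(nxt)
        pool.remove(nxt)
    return selected


# Hypothetical example: users 0 and 1 are local to BS 0, users 2 and 3 to BS 1.
example_norms = [[2.0, 0.1], [1.8, 0.2], [0.1, 1.5], [0.2, 1.0]]
example_schedule = nus(example_norms, [1.0] * 4, eps=0.3, max_users=2)
```

In this example the second user is taken from the other cell, because the same-cell candidate violates the orthogonality constraint in (16).
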
We next discuss the selection of the orthogonality threshold $\epsilon$ by analyzing its influence on
the performance. With the full CSI, the SNR of the selected user $s_{l+1}$ under ZF precoding is

$$\gamma_{s_{l+1}} = \frac{p_{s_{l+1}}\nu_{\mathcal{S}_l s_{l+1}}}{\sigma_{s_{l+1}}^2}, \qquad (18)$$

where $p_{s_{l+1}}$ is the power allocated to user $s_{l+1}$. The SNR can be lower bounded by

$$\gamma_{s_{l+1}} \ge \frac{p_{s_{l+1}}\nu^{lb}_{\mathcal{S}_l s_{l+1}}}{\sigma_{s_{l+1}}^2} = p_{s_{l+1}}\max_{i_{km} \in \mathcal{T}_l}\frac{\nu^{lb}_{\mathcal{S}_l i_{km}}}{\sigma_{i_{km}}^2}, \qquad (19)$$

since $\nu^{lb}_{\mathcal{S}_l s_{l+1}}$ is the lower bound of $\nu_{\mathcal{S}_l s_{l+1}}$ and the user is selected according to (17). Considering
that $\mu_{i_{km}s_j} \le \epsilon$ for all $i_{km} \in \mathcal{T}_l$, according to (14) the SNR can be further lower bounded by

$$\gamma_{s_{l+1}} \ge p_{s_{l+1}}(1 - l\epsilon^2)\max_{i_{km} \in \mathcal{T}_l}\frac{\|\mathbf{h}_{i_{km}}\|^2}{\sigma_{i_{km}}^2}. \qquad (20)$$

It follows that the threshold $\epsilon$ has an intertwined impact on the performance. On one hand,
the non-orthogonality between users reduces the receive power, which is reflected in the term
$\sum_{j=1}^{l}\cos^2\theta_{i_{km}s_j}$ in (11). Considering that $\cos\theta_{i_{km}s_j} \le \mu_{i_{km}s_j} \le \epsilon$, such a reduction is upper bounded
by $l\epsilon^2$, which becomes negligible if $\epsilon$ is small enough. On the other hand, the multiuser diversity
gain achieved by user $s_{l+1}$ depends on the size of $\mathcal{T}_l$ from which user $s_{l+1}$ is selected. Since the
elements in $\mathcal{T}_l$ satisfy the orthogonality constraint in (16), a large threshold should be chosen to
ensure a large user candidate pool.
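
As a small worked example of the first effect, the snippet below evaluates the worst-case factor $1 - l\epsilon^2$ from (20) and expresses it in dB. The function names and the chosen values of $l$ and $\epsilon$ are illustrative assumptions only.

```python
import math


def snr_loss_factor(l, eps):
    """Worst-case factor (1 - l * eps^2) that scales the SNR bound in (20)."""
    return 1.0 - l * eps ** 2


def to_db(x):
    """Convert a linear power ratio to dB."""
    return 10.0 * math.log10(x)


# Two previously selected users and a threshold of 0.3.
factor = snr_loss_factor(2, 0.3)
loss_db = to_db(factor)
```

Here the factor is 0.82, i.e. a worst-case loss of slightly less than 1 dB, whereas a larger threshold would admit more candidates at the cost of a smaller factor.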

When the NUS is applied in a network (M, Nt, K), each user to be scheduled only needs to 
feed back M real scalars to the coordinated BSs at each time slot. Only the selected users need 
to provide their channel vectors to the BSs for precoding. 

B. Local Channel Aided NUS (LocalNUS) 

As analyzed earlier, the channel norm provides sufficient information for judging the orthog- 
onality between users only when they are not located in the same cell. As a result, the NUS 
is prone to select users located in different cells even when the small-scale fading channels of 
some users in a cell have good orthogonality. This will decrease the multiuser diversity gain, 
which can be overcome by a scheduler with the aid of full local channels. 

We have found that the upper bound $\mu_{i_{km}s_j}$ in the NUS is not tight for two users located in
the same cell, which is caused by amplifying $|\mathbf{h}_{i_{km}m}\mathbf{h}_{s_jm}^H|$ to $\|\mathbf{h}_{i_{km}m}\|\|\mathbf{h}_{s_jm}\|$. If all users
feed back their local channels, we can compute $|\mathbf{h}_{i_{km}m}\mathbf{h}_{s_jm}^H|$, and thereby a tighter upper bound
of $\cos\theta_{i_{km}s_j}$ can be achieved as

$$\tilde{\mu}_{i_{km}s_j} = \frac{\sum_{n=1, n \ne m}^{M}\|\mathbf{h}_{i_{km}n}\|\|\mathbf{h}_{s_jn}\| + t}{\sqrt{\sum_{n=1}^{M}\|\mathbf{h}_{i_{km}n}\|^2}\sqrt{\sum_{n=1}^{M}\|\mathbf{h}_{s_jn}\|^2}}, \qquad (21)$$

where $t = |\mathbf{h}_{i_{km}m}\mathbf{h}_{s_jm}^H|$ if user $s_j$ is also in the $m$th cell; otherwise $t = \|\mathbf{h}_{i_{km}m}\|\|\mathbf{h}_{s_jm}\|$, since
$\mathbf{h}_{s_jm}$ is still not available in this case.

By replacing $\mu_{i_{km}s_j}$ and the corresponding $\nu^{lb}_{\mathcal{S}_l i_{km}}$ with their counterparts based on $\tilde{\mu}_{i_{km}s_j}$ in the procedure of the NUS,
we get a new scheduler which is named LocalNUS. To apply the LocalNUS, each user needs
to feed back the full local channel and $M - 1$ norms of cross channels at each time slot.
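
The effect of the local channel can be illustrated with a short sketch of the tightened bound. It is a hedged example: the function names, the argument layout (per-BS norms plus the local channel vector and the cell index of each user) and the numbers are assumptions made for illustration.

```python
import math


def norms_denominator(norms_i, norms_j):
    """Product of the two global channel norms, built from sublink norms."""
    return math.sqrt(sum(a * a for a in norms_i)) * math.sqrt(sum(b * b for b in norms_j))


def mu_norm_only(norms_i, norms_j):
    """NUS bound: every sublink inner product is replaced by a product of norms."""
    return sum(a * b for a, b in zip(norms_i, norms_j)) / norms_denominator(norms_i, norms_j)


def mu_local(norms_i, norms_j, local_i, local_j, cell_i, cell_j):
    """LocalNUS bound: use the exact local inner product for same-cell users."""
    m = cell_i
    cross = sum(a * b for n, (a, b) in enumerate(zip(norms_i, norms_j)) if n != m)
    if cell_j == m:
        # Both local channels are from BS m, so the inner product is available.
        t = abs(sum(x * y.conjugate() for x, y in zip(local_i, local_j)))
    else:
        # The sublink from BS m to the other user is known only through its norm.
        t = norms_i[m] * norms_j[m]
    return (cross + t) / norms_denominator(norms_i, norms_j)


# Hypothetical same-cell users whose local channels happen to be orthogonal.
norms_a, norms_b = [1.0, 0.1], [1.0, 0.1]
local_a, local_b = [1.0 + 0j, 0j], [0j, 1.0 + 0j]
loose = mu_norm_only(norms_a, norms_b)
tight = mu_local(norms_a, norms_b, local_a, local_b, cell_i=0, cell_j=0)
```

The norm-only bound equals 1 here and would exclude the pair, while the local channel aided bound is close to zero, so the pair can be co-scheduled.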



C. Large-scale Fading based Scheduler (LUS) 

With the NUS and the LocalNUS, the two-phase transmit strategy can effectively reduce the
feedback overhead. However, the induced scheduling delay will degrade the performance
in time-varying channels.

We propose a quasi-static scheduler to solve this problem. Considering the fact that the channel
norm is dominated by the large-scale fading gain, then by replacing the squared norms of the sublink
channels $\|\mathbf{h}_{i_{km}n}\|^2$ in (13) and (14) with the corresponding large-scale fading gains $\alpha_{i_{km}n}$, the
upper bound $\mu_{i_{km}s_j}$ and the lower bound $\nu^{lb}_{\mathcal{S}_l i_{km}}$ can be approximately expressed as

$$\bar{\mu}_{i_{km}s_j} = \frac{\sum_{n=1}^{M}\sqrt{\alpha_{i_{km}n}\alpha_{s_jn}}}{\sqrt{\sum_{n=1}^{M}\alpha_{i_{km}n}}\sqrt{\sum_{n=1}^{M}\alpha_{s_jn}}}, \quad \text{and} \quad \bar{\nu}^{lb}_{\mathcal{S}_l i_{km}} = \sum_{n=1}^{M}\alpha_{i_{km}n}\left(1 - \sum_{j=1}^{l}\bar{\mu}_{i_{km}s_j}^2\right). \qquad (22)$$

By applying $\bar{\mu}_{i_{km}s_j}$ and $\bar{\nu}^{lb}_{\mathcal{S}_l i_{km}}$ in the procedure of the NUS, we obtain a large-scale fading
gain based scheduler, called LUS. Since the large-scale fading gain varies very slowly, all users
only need to feed back their large-scale fading gains at a very low rate.

It should be noted that the quasi-static LUS is only applicable for the systems with asymmetric 
channels. When the LUS is used in single-cell systems, it is equivalent to select the users with 
the largest average channel gains which generally does not perform well, since the orthogonality 
between users is independent of their large-scale fading gains in this case. 
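
As an illustration of why large-scale gains alone can separate users in asymmetric channels, the sketch below evaluates the LUS metrics for the two-BS geometry and pathloss model of Section III-B. The function names are hypothetical and shadowing is ignored, so this is a sketch under those assumptions rather than the simulated system.

```python
import math


def pathloss_gain(distance_m):
    """Linear large-scale gain from the model -35.3 - 37.6 log10(d) dB."""
    return 10 ** ((-35.3 - 37.6 * math.log10(distance_m)) / 10)


def large_scale_gains(position, bs_positions=(-250.0, 250.0)):
    """Gains from every BS to a user located at the given position."""
    return [pathloss_gain(abs(position - bs)) for bs in bs_positions]


def lus_mu(alpha_i, alpha_j):
    """Approximate upper bound on cos(theta) using large-scale gains only."""
    num = sum(math.sqrt(a * b) for a, b in zip(alpha_i, alpha_j))
    return num / math.sqrt(sum(alpha_i) * sum(alpha_j))


def lus_nu(alpha_i, selected_alphas):
    """Approximate lower bound on the orthogonally projected norm."""
    penalty = sum(lus_mu(alpha_i, a) ** 2 for a in selected_alphas)
    return sum(alpha_i) * (1.0 - penalty)


same_cell = lus_mu(large_scale_gains(100.0), large_scale_gains(100.0))
diff_cell = lus_mu(large_scale_gains(-100.0), large_scale_gains(100.0))
```

Two users at the same position give a metric of 1, whereas users in different cells give a much smaller value. In a single-cell system all sublinks of a user share one gain, so the metric is always 1 and carries no orthogonality information.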

D. The Fairness Issues 

The schedulers considered so far favor users with good channel conditions, which follows
from the selection criterion of (17). As a result, only users at the cell center will be served,
which contradicts the goal of CoMP to improve the cell-edge throughput. This can be solved
by combining the proposed schedulers with fair scheduling algorithms such as round-robin (RR)
or the proportional fair scheduler.

To demonstrate the performance of the proposed schedulers in CoMP systems, we extend them 
in a RR fashion similar to MW^, named the RR-NUS, the RR-LocalNUS and the RR-LUS, 
respectively. They select a group of users at each time slot and remove the selected users from 
the user pool at next time slot. The RR-LUS can be quasi-statically performed over a relatively 
long period, during which the pre-selected users feed back their CSI for precoding at each time 
slot. Since equal time slots are allocated to all the users, a short-term fairness is guaranteed [20].
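
The round-robin extension can be sketched as a wrapper around any group scheduler. The code below is a minimal illustration: `round_robin_groups` and the stand-in scheduler `strongest_two` are hypothetical names, and in practice the NUS, LocalNUS or LUS selection would be passed in instead.

```python
def round_robin_groups(user_pool, select_group):
    """Split the pool into groups by scheduling the remaining users slot by slot."""
    remaining = list(user_pool)
    groups = []
    while remaining:
        group = select_group(remaining)
        if not group:
            raise ValueError("scheduler must select at least one user")
        groups.append(group)
        # Users served in this slot are removed from the pool of the next slot.
        remaining = [u for u in remaining if u not in group]
    return groups


def strongest_two(pool):
    """Hypothetical stand-in scheduler: pick the two largest user indices."""
    return sorted(pool, reverse=True)[:2]


groups = round_robin_groups(range(1, 6), strongest_two)
period = len(groups)  # number of time slots in one scheduling period
```

Every user appears in exactly one group per scheduling period, which is the short-term fairness property mentioned above.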

V. Feedback Overhead Comparison 

A. Channel Quantization Model 

We consider quantized feedback to provide CSI at BSs with the following assumptions. 
Assumption 1: Each user has perfect downlink channels. 

We express the sublink channel from the $m$th BS to the $i$th user as $\mathbf{h}_{im} = \rho_{im}\mathbf{v}_{im}$, where
$\rho_{im}$ is the norm of $\mathbf{h}_{im}$, referred to as the CQI, and $\mathbf{v}_{im} = \mathbf{h}_{im}/\|\mathbf{h}_{im}\|$ is the CDI. Since the
global CoMP channel vector of each user consists of the sublink channel vectors from multiple
coordinated BSs, we make the following assumption.

Assumption 2: The CDI of the sublink channel vector from every coordinated BS is individually
quantized for each user. The $i$th user quantizes $\mathbf{v}_{im}$ to a unit norm vector chosen from a codebook
$\mathcal{C}_{im} = [\mathbf{c}_{im,1}^T, \ldots, \mathbf{c}_{im,J}^T]^T$ according to the minimum distance criterion, i.e., by maximizing the
magnitude of the inner product,

$$j_{im} = \arg\max_{j = 1, \ldots, J}|\mathbf{v}_{im}\mathbf{c}_{im,j}^H|, \quad m = 1, \ldots, M, \quad J = 2^B. \qquad (23)$$

In order to avoid the same channel quantization for different users, the codebook should be
user specific. Here we consider the correlated random codebook $\mathcal{C}_{im}$ of [21], in which
the unit norm vectors are generated by $\mathbf{c}_{im,j} = \mathbf{e}_{im,j}/\|\mathbf{e}_{im,j}\|$ with correlated complex Gaussian
random vectors $\mathbf{e}_{im,j} \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_{im})$ for $j \in \{1, \ldots, J\}$, where $\mathbf{R}_{im}$ is the correlation matrix of
the small-scale sublink channel $\bar{\mathbf{h}}_{im}$.

The codebook $\mathcal{C}_{im}$ is predefined and known by both the $i$th user and the coordinated BSs.
The $i$th user feeds back the selected codebook indices to its local BS, each with $B$ bits. Apart
from the CDI, the CQI should also be quantized and sent back. Due to the ease of quantizing
a scalar $\rho_{im}$, we make the third assumption as in [15] and [18].

Assumption 3: The CQI is perfectly known to the BSs through feedback.

With the feedback for both the CDI and the CQI, the BSs can reconstruct the global channel
vector of the $i$th user as $\hat{\mathbf{h}}_i = [\hat{\mathbf{h}}_{i1}, \ldots, \hat{\mathbf{h}}_{iM}]$ with $\hat{\mathbf{h}}_{im} = \rho_{im}\mathbf{c}_{im,j_{im}}$. Based on the reconstructed
channels, scheduling, ZF beamforming and power allocation can be conducted.
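
The quantization and reconstruction steps can be sketched as follows. This is an illustration under simplifying assumptions: the codebook is drawn from i.i.d. complex Gaussian vectors (i.e., an identity correlation matrix rather than the correlated codebook considered here), the CQI is passed unquantized as in Assumption 3, and all function names are hypothetical.

```python
import math
import random


def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def unit(v):
    """Scale a vector to unit norm."""
    n = norm(v)
    return [x / n for x in v]


def random_codebook(num_bits, dim, rng):
    """Draw J = 2^B unit-norm codewords from i.i.d. complex Gaussian vectors."""
    return [unit([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)])
            for _ in range(2 ** num_bits)]


def quantize(h, codebook):
    """Return the CQI (norm) and the index of the codeword closest in direction."""
    rho = norm(h)
    v = [x / rho for x in h]
    idx = max(range(len(codebook)),
              key=lambda j: abs(sum(a * b.conjugate() for a, b in zip(v, codebook[j]))))
    return rho, idx


def reconstruct(rho, idx, codebook):
    """Rebuild the sublink channel at the BS from the CQI and the fed-back index."""
    return [rho * c for c in codebook[idx]]


rng = random.Random(1)
codebook = random_codebook(num_bits=4, dim=2, rng=rng)
h = [0.3 + 0.4j, -1.2 + 0.1j]          # hypothetical sublink channel
rho, idx = quantize(h, codebook)
h_hat = reconstruct(rho, idx, codebook)
```

The reconstructed vector keeps the channel norm exactly and approximates only the direction, so the quantization error shrinks as the number of bits $B$ grows.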

To achieve a close approximation of the channel vector, a large-size codebook is generally
necessary, which however is unaffordable in practical systems due to the huge feedback overhead.
In practice, the number of feedback bits per user is limited not only by the capacity of its own
feedback link [15], but also by the capacity of the system feedback channel that is shared by all
users [16]. Therefore, two constraints with respect to the size of the codebook are considered
in this paper. We limit the number of feedback bits per user to $B_u$ for quantizing $M$ sublink
channels, and also limit the total number of feedback bits from all users to $B_t$.

B. Feedback Overhead with Different Schedulers 

Based on the channel quantization model, we analyze the codebook size when different 
schedulers are employed given the per-user and total feedback constraints, Bu and Bt. 




Both the GUS and the SUS are designed for systems with a one-phase feedback strategy, which
requires all users to feed back the quantized channels for scheduling and precoding. The size of
the codebook, $B$, should satisfy $MB \le B_u$ and $M^2KB \le B_t$, i.e., $B \le \min\left(\frac{B_u}{M}, \frac{B_t}{M^2K}\right)$.

The proposed schedulers are designed for systems with a two-phase feedback strategy. For the NUS,
at each time slot each user needs to feed back $M$ real scalars (i.e., CQI) to the coordinated BSs.
In addition, the selected $L$ users need to provide the channels for precoding. If we neglect the
overhead for feeding back scalars, the size of the codebook should satisfy $B \le \min\left(\frac{B_u}{M}, \frac{B_t}{ML}\right)$. For
the LocalNUS, at each time slot each user needs to provide the channel vectors of local channels
and the CQI of cross channels, and then the selected $L$ users need to feed back the channel
vectors. The size of the codebook should satisfy $B \le \min\left(\frac{B_u}{M}, \frac{B_t}{MK + (M-1)L}\right)$. For the LUS, the
feedback overhead for the large-scale fading gain is negligible, and at each time slot the $L$ selected
users feed back their channel vectors, which leads to a codebook size of $B \le \min\left(\frac{B_u}{M}, \frac{B_t}{ML}\right)$.

The feedback overhead of different schedulers given the size of the codebook, as well as the
size of the codebook given the per-user and total feedback constraints, is summarized in Table
I. Note that the number of the scheduled users $L$ for different schedulers differs.
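
The bit budgets discussed in this subsection can be compared with a small helper. It is a sketch under assumptions: the denominators are the bit counts implied by the description of each scheduler (with scalar feedback neglected), the LocalNUS count of $MK + (M-1)L$ codeword indices in particular is an inferred value, and the function name and example numbers are hypothetical.

```python
def codebook_bits(scheduler, M, K, L, B_u, B_t):
    """Largest integer codebook size B (bits per sublink) allowed by B_u and B_t."""
    # Number of B-bit codeword indices fed back in total per time slot.
    indices = {
        "GUS": M * M * K,              # all MK users quantize M sublinks
        "SUS": M * M * K,
        "NUS": M * L,                  # only the L selected users quantize
        "LocalNUS": M * K + (M - 1) * L,  # local channels of all users, cross channels of L users
        "LUS": M * L,
    }[scheduler]
    return min(B_u // M, B_t // indices)


# Hypothetical budget: 3 cells, 10 users per cell, 6 scheduled users.
budget = {name: codebook_bits(name, M=3, K=10, L=6, B_u=30, B_t=300)
          for name in ("GUS", "SUS", "NUS", "LocalNUS", "LUS")}
```

Under this example budget the one-phase schedulers are limited by the total feedback constraint, while the NUS and the LUS are limited only by the per-user constraint.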

VI. Simulation Results 

In this section, we analyze the tightness of the lower bounds derived previously and evaluate the performance of the proposed schedulers via Monte-Carlo simulations. In addition to the RR-NUS, the RR-LocalNUS and the RR-LUS, three relevant user schedulers, namely the RR-RUS (random user scheduler [12], [13]), the RR-GUS and the RR-SUS, are also considered for comparison. Similar to the RR-NUS, they select a group of users to be served at each time slot according to the RUS, the GUS and the SUS, respectively. ZFBF together with the optimal power allocation that maximizes the achievable sum rate under PBPC is used for all schemes. The GUS under PBPC is extended from the scheduler proposed in [22]. Assume that a scheduling period of the RR-based schedulers consists of $Q$ time slots; in other words, the total user pool is divided into $Q$ user groups. Since the scheduling period differs among the schedulers, we use $Q$ to normalize the user's throughput in the following performance evaluations.
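The round-robin structure and the normalization by the scheduling period can be sketched as follows. This is a minimal Python illustration under simplifying assumptions: `select_group` is a hypothetical callback standing in for the RUS, GUS, SUS or NUS selection rule, and each user is assumed to be served in exactly one slot per period.

```python
def round_robin_groups(user_pool, select_group):
    """Serve every user once; return one group of users per time slot.

    select_group(remaining) is a hypothetical stand-in for the actual
    selection rule; it must return a non-empty subset of `remaining`.
    """
    remaining = list(user_pool)
    groups = []
    while remaining:
        group = list(select_group(remaining))
        if not group:
            raise ValueError("select_group returned an empty group")
        groups.append(group)
        remaining = [u for u in remaining if u not in group]
    return groups


def normalized_throughput(rate_when_served, groups):
    """Per-user throughput averaged over the scheduling period of Q slots."""
    Q = len(groups)
    return {u: rate_when_served[u] / Q for g in groups for u in g}


# Toy rule: serve the first three remaining users in each slot.
groups = round_robin_groups(range(7), lambda rem: rem[:3])
rates = {u: 6.0 for u in range(7)}
tput = normalized_throughput(rates, groups)
```

In this toy case the seven users are split into Q = 3 groups, so a rate of 6 while being served becomes a normalized throughput of 2. A rule that serves more users per slot shortens the period but typically lowers the rate of each served user.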

With the same system model and parameters as in Section III-B, except that each BS has two antennas in order to analyze the impact of local channels, we show the tightness of the lower bound of $\nu_{\mathcal{S}_l, i_{km}}$ with respect to the position of user $i_{km}$, denoted by $d_2$, in Fig. 4. We take the normalized gap between $\nu_{\mathcal{S}_l, i_{km}}$ and its lower bound as the metric, which is defined as the difference between $\nu_{\mathcal{S}_l, i_{km}}$ and the lower bound, divided by $\nu_{\mathcal{S}_l, i_{km}}$. We consider the case where $\mathcal{S}_l$ includes one user located at the position $d_1 = -50$ m. It is shown that the lower bound is not tight for the NUS and the LUS when $d_2 < 0$, i.e., when the two users are in the same cell, and the largest gap occurs when the two users are at the same position. With the aid of local channels, the bound is tightened. As a result, the LocalNUS can improve the performance significantly, which will be shown in the sequel. However, when user $i_{km}$ is located at the cell edge, the bounds are still loose.

The simulation setup is based on [23]. In particular, we consider a CoMP system consisting of 3 coordinated cells surrounded by interfering cells as shown in Fig. 5. The interference from the non-cooperative cells is modeled as white noise. The BS-to-BS distance is 500 m, and the channel bandwidth is 10 MHz. The BSs transmit with a maximal power of 40 W, and the users have a receiver noise figure of 9 dB. The path loss exponent is 3.76, the lognormal shadowing standard deviation is 8 dB, and the mean power loss at the reference distance of 1 m is 36.3 dB. Each BS is equipped with a 4-antenna uniform linear array with half-wavelength antenna separation. Each cell has 20 uniformly distributed single-antenna users that are surrounded by rich scatterers and keep a minimum distance of 35 m from the BS. For each drop of users, the small-scale fading channels are modeled as Rayleigh fading with correlation at the transmitter side. The correlation matrix is generated following the single-bounce model [24], where the angular spread of the Gaussian distributed scatterers at the user side is assumed to be 15 degrees unless otherwise specified. In the scenario of limited feedback, the correlation matrix is assumed to be known a priori for codebook generation. All the results are averaged over 100 drops.
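To give a feeling for the operating point implied by these parameters, the following Python sketch evaluates the mean path loss and the resulting mean SNR of a single link. It is only a rough link budget under stated assumptions: a thermal noise density of -174 dBm/Hz, no antenna or array gain, the full 40 W used on one link, and shadowing passed in as a fixed value in dB. The function names are hypothetical.

```python
import math


def path_loss_db(d_m, exponent=3.76, ref_loss_db=36.3):
    """Mean path loss at distance d_m (metres), reference distance 1 m."""
    return ref_loss_db + 10.0 * exponent * math.log10(d_m)


def noise_power_dbm(bandwidth_hz=10e6, noise_figure_db=9.0):
    """Thermal noise (-174 dBm/Hz assumed) plus the receiver noise figure."""
    return -174.0 + 10.0 * math.log10(bandwidth_hz) + noise_figure_db


def mean_snr_db(d_m, tx_power_w=40.0, shadowing_db=0.0):
    """Mean SNR of one BS-user link; antenna gains are ignored."""
    tx_dbm = 10.0 * math.log10(tx_power_w * 1e3)
    return tx_dbm - path_loss_db(d_m) - shadowing_db - noise_power_dbm()


snr_near = mean_snr_db(35.0)    # minimum BS-user distance
snr_edge = mean_snr_db(250.0)   # half the BS-to-BS distance
```

With these assumptions the noise power is -95 dBm, and the mean SNR drops from roughly 47 dB at 35 m to roughly 15 dB at 250 m before shadowing and interference are taken into account.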

In Fig. 6, we plot the cell-average throughput and the cell-edge throughput¹ achieved by each individual user using ZF precoding together with the RR-NUS, the RR-LocalNUS and the RR-LUS subject to the PBPC, as a function of the threshold $\epsilon$ used in the three schedulers. We can see that neither the cell-average nor the cell-edge throughput is a monotonic function of $\epsilon$ for the three schedulers. This results from the complicated influence of $\epsilon$ on the performance. As analyzed earlier, at each time slot $\epsilon$ affects not only the user's receive power but also the multiuser diversity gain. These two effects of $\epsilon$ counteract each other, which should be considered when selecting $\epsilon$. Besides, for a given total number of users, the scheduling period of the RR-based schedulers depends on the number of selected users at each time slot, which is determined by $\epsilon$. Again, this shows that the selection of $\epsilon$ is not straightforward. From the figure, the thresholds of the RR-NUS, the RR-LocalNUS and the RR-LUS can be chosen as 0.4, 0.8 and 0.8, respectively, to achieve high cell-average and cell-edge throughput.

¹ The cell-edge throughput is defined as the 5% point of the cumulative distribution function (CDF) of the user throughput normalized by the channel bandwidth [25]. In the simulations, the normalized user throughput is obtained via the Shannon capacity formula.
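The two performance metrics can be computed from a set of per-user SINRs as in the following Python sketch. It assumes that the 5% point is taken from the empirical CDF of the normalized user throughput and that the Shannon rate is divided by the scheduling period $Q$; the function names are hypothetical.

```python
import math


def normalized_rate(sinr_linear, Q=1):
    """Shannon rate in bit/s/Hz, divided by the scheduling period Q."""
    return math.log2(1.0 + sinr_linear) / Q


def cell_average(rates):
    """Arithmetic mean of the normalized user throughputs."""
    return sum(rates) / len(rates)


def cell_edge(rates, fraction=0.05):
    """Smallest rate r whose empirical CDF satisfies F(r) >= fraction."""
    ordered = sorted(rates)
    # The small offset guards against floating-point round-up in the product.
    index = max(math.ceil(fraction * len(ordered) - 1e-9) - 1, 0)
    return ordered[index]


# Toy example: 20 users with throughputs 1, 2, ..., 20 bit/s/Hz.
toy_rates = [float(i) for i in range(1, 21)]
avg = cell_average(toy_rates)
edge = cell_edge(toy_rates)
```

For the 20 toy values the 5% point coincides with the smallest throughput, which illustrates why the cell-edge metric is dominated by the worst-served users.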

Figure 7 shows the CDF of the user throughput of the six schedulers when perfect CSI is assumed for precoding, where the RR-GUS and the RR-SUS use the perfect CSI, the RR-NUS uses only the perfect channel norms, and the RR-LocalNUS uses only the perfect local channels and the norms of the cross channels. Compared with the RR-RUS, the other schedulers show an obvious performance gain. At each time slot, the RR-RUS randomly serves as many users as possible because it has no criterion to determine the number of selected users, which leads to the shortest scheduling period. Although the RR-RUS may select a few high-rate users, the severe inter-user interference caused by random grouping degrades the performance of most users. The RR-GUS and the RR-SUS achieve high throughput owing to the perfect knowledge of CSI at the BSs. The RR-NUS effectively reduces the feedback overhead but pays a performance penalty. Nevertheless, the gap between the RR-NUS and the RR-GUS/SUS is recovered by the RR-LocalNUS with the additional feedback of local channels. The performance of the RR-LUS is slightly inferior to that of the RR-NUS, since the channel norms are dominated by the large-scale fading gains. We can also see the impact of the tightness of the lower bound on the performance. Since the bounds are loose for cell-edge users, the channel norm-based schedulers tend to use single-user MIMO precoding to serve these users. This reduces the multiuser multiplexing gain and prolongs the scheduling period. As a result, the user throughput is reduced.

The performance of the six schedulers with limited feedback is shown in Fig. 8 and Fig. 9, where the angular spread at the user side is set to 15 degrees and 35 degrees, respectively, which are typical values considered in evaluations [23]. To separate the impact of the special spatial correlation caused by the channel asymmetry from the impact of the sublink channel correlation caused by a small angular spread, we also provide results for the channel with a 360-degree angular spread, which is in fact uncorrelated, in Fig. 10. We limit the number of feedback bits for quantizing each sublink channel to 4, which results in a maximal number of feedback bits per user of $B_u = 12$. To reflect the constraint on the total number of bits for all users, we set $B_t = 0.6MKB_u$ in the simulation. From the figures we see that decreasing the angular spread improves the throughput. This is because the considered correlated random codebook reduces the quantization error of the channels with small angular spread, as observed in [26]. Compared with the case of perfect CSI in Fig. 7, limited feedback with few bits largely affects the relationship between the RR-NUS and the RR-GUS/SUS. The gap between the RR-NUS and the RR-GUS/SUS is small for the angular spread of 15 degrees in Fig. 8, and the RR-NUS performs the best for the angular spreads of 35 and 360 degrees in Fig. 9 and Fig. 10. This is because the RR-NUS has the largest codebook for channel quantization, as shown in Table I, which provides a better precoder for mitigating inter-user interference.
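The link between channel quantization accuracy and residual inter-user interference can be illustrated with a two-user zero-forcing example. The Python sketch below is not the codebook-based scheme of the paper: the "quantized" channel is simply a hypothetical perturbed copy of the true channel, the power allocation and the PBPC are ignored, and all function names are illustrative.

```python
def gram(H):
    """G = H H^H for a 2 x N complex matrix given as two row lists."""
    return [[sum(a * b.conjugate() for a, b in zip(H[i], H[j]))
             for j in range(2)] for i in range(2)]


def zf_precoder(H_hat):
    """W = H_hat^H (H_hat H_hat^H)^{-1} for two users, as an N x 2 list."""
    G = gram(H_hat)
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    G_inv = [[G[1][1] / det, -G[0][1] / det],
             [-G[1][0] / det, G[0][0] / det]]
    N = len(H_hat[0])
    return [[sum(H_hat[i][n].conjugate() * G_inv[i][k] for i in range(2))
             for k in range(2)] for n in range(N)]


def effective_channel(H, W):
    """E = H W; E[k][j] is the gain from user j's stream to user k."""
    return [[sum(H[k][n] * W[n][j] for n in range(len(W)))
             for j in range(2)] for k in range(2)]


# True channels (rows are users) and a coarse, hypothetical estimate of them.
H_true = [[1 + 1j, 0.5 + 0j, 0.2j], [0.3 + 0j, 1 - 0.5j, 0.4 + 0j]]
H_hat = [[1 + 1j, 0.5 + 0j, 0j], [0.3 + 0j, 1 - 0.5j, 0.5 + 0j]]

E_perfect = effective_channel(H_true, zf_precoder(H_true))
E_mismatch = effective_channel(H_true, zf_precoder(H_hat))
```

With the true channels the off-diagonal entries of the effective channel vanish, whereas the precoder computed from the coarse estimate leaves a nonzero leakage term. A larger codebook reduces this mismatch, which is consistent with the advantage of the RR-NUS noted above.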

Finally, the impact of time-varying channels on the performance of the schedulers is shown in Fig. 11. A scheduling delay of one time slot of 5 ms is considered. The time-varying channels are generated based on Jakes' model [27]. As expected, the time-varying channels degrade the performance of the RR-NUS and the RR-LocalNUS due to the scheduling delay, but have no impact on the performance of the quasi-static RR-LUS.
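The severity of the scheduling delay can be gauged from the temporal correlation of the channel. Under Jakes' model the autocorrelation at lag $\tau$ is $J_0(2\pi f_d \tau)$ with Doppler frequency $f_d = v f_c / c$. The following Python sketch evaluates it for the 5 ms delay and the 2 GHz operation frequency stated in the caption of Fig. 11; the series-based Bessel routine and the function names are illustrative choices, and $c = 3 \times 10^8$ m/s is assumed.

```python
import math


def bessel_j0(x, terms=30):
    """J0(x) from its power series; adequate for the small arguments here."""
    total, term = 0.0, 1.0
    q = (x * x) / 4.0
    for k in range(terms):
        total += term
        term *= -q / ((k + 1) ** 2)  # next term: (-q)^(k+1) / ((k+1)!)^2
    return total


def jakes_correlation(speed_kmh, delay_s=5e-3, carrier_hz=2e9, c=3.0e8):
    """Correlation between the channel at scheduling and at transmission."""
    doppler_hz = (speed_kmh / 3.6) * carrier_hz / c
    return bessel_j0(2.0 * math.pi * doppler_hz * delay_s)


rho_static = jakes_correlation(0.0)
rho_30 = jakes_correlation(30.0)
```

At 30 km/h the Doppler frequency is about 56 Hz and the correlation after 5 ms falls to roughly 0.37, so the small-scale fading used for scheduling is largely outdated, while the large-scale fading used by the LUS is unaffected.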

VII. Conclusion

We have studied low feedback user scheduling for CoMP MU-MIMO systems in this paper. 

By characterizing the statistics of the angle between two users' channel vectors and analyzing 
the tightness of a lower bound of the orthogonally projected norm, we have shown that the 
inherent asymmetric feature of CoMP channels enables the channel norm to provide sufficient 
information on determining the orthogonality between users. Based on this observation, we have 
proposed a channel norm-based user scheduler (NUS) and an enhanced local-channel aided NUS 
(LocalNUS). To reduce the scheduling delay of the two schedulers, we have proposed a quasi- 
static scheduler using large-scale fading gain (LUS). Simulation results show that the LocalNUS 
has fairly good performance with much less feedback overhead than the GUS and the SUS with 
perfect CSI, the NUS performs very close to or even better than the GUS and the SUS when 
limited feedback is considered, and the LUS performs well in time-varying channels.

The proposed schedulers are applicable to downlink BS cooperative transmission systems and distributed antenna systems, as well as to BS and relay cooperative systems, where the channels exhibit an asymmetric feature.
Appendix A
Joint pdf of q 

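We first investigate the pdf of $\mathbf{v}_1$; then, with the derivation of the conditional joint pdf of $\mathbf{q}$ given $\mathbf{v}_1$, $f_{\mathbf{q}}(\mathbf{x})$ can be obtained according to Bayes' rule.

Define $\mathbf{h}_1 = [\sqrt{\xi_1}e^{j\phi_1}, \ldots, \sqrt{\xi_N}e^{j\phi_N}]$ and $\eta = \|\mathbf{h}_1\|^2 = \sum_{n=1}^{N}\xi_n$. Then $\mathbf{v}_1$ can be expressed as $\mathbf{v}_1 = [\sqrt{\delta_1}e^{j\phi_1}, \ldots, \sqrt{\delta_N}e^{j\phi_N}]$ with $\delta_n = \xi_n/\eta$, where $0 \le \delta_n \le 1$, $0 \le \phi_n \le 2\pi$, $n = 1, \ldots, N$, and $\delta_N = 1 - \sum_{n=1}^{N-1}\delta_n$.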

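The joint pdf of $\eta$, $\boldsymbol{\xi}$ and $\boldsymbol{\phi}$, $f_{\eta,\boldsymbol{\xi},\boldsymbol{\phi}}(x, \mathbf{y}, \mathbf{z})$, is given in [28], where $\boldsymbol{\xi} = [\xi_1, \ldots, \xi_{N-1}]$ and $\boldsymbol{\phi} = [\phi_1, \ldots, \phi_N]$, from which we can obtain the joint pdf of $\boldsymbol{\delta}$ and $\boldsymbol{\phi}$ as

$f_{\boldsymbol{\delta},\boldsymbol{\phi}}(\mathbf{y}, \mathbf{z}) = \int_0^{\infty} x^{N-1} f_{\eta,\boldsymbol{\xi},\boldsymbol{\phi}}(x, \mathbf{y}, \mathbf{z})\, dx$,    (24)

where $\boldsymbol{\delta} = [\delta_1, \ldots, \delta_{N-1}]$, and $x^{N-1}$ is the Jacobian determinant.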
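The origin of the Jacobian factor can be traced with a short calculation. The following is a sketch under the assumption that the change of variables is $\xi_n = \eta\,\delta_n$ for $n = 1, \ldots, N-1$, with $\eta$ and the phases left unchanged; the matrix of partial derivatives is then lower triangular, so its determinant is the product of its diagonal entries.

```latex
\frac{\partial(\eta, \xi_1, \ldots, \xi_{N-1})}
     {\partial(\eta, \delta_1, \ldots, \delta_{N-1})}
=
\begin{bmatrix}
1            & 0      & \cdots & 0      \\
\delta_1     & \eta   & \cdots & 0      \\
\vdots       & \vdots & \ddots & \vdots \\
\delta_{N-1} & 0      & \cdots & \eta
\end{bmatrix},
\qquad
\left| \det \frac{\partial(\eta, \xi_1, \ldots, \xi_{N-1})}
                 {\partial(\eta, \delta_1, \ldots, \delta_{N-1})} \right|
= \eta^{N-1}.
```

Integrating the transformed density over $\eta$, written as the integration variable $x$, then removes the channel norm and leaves a density in the normalized powers and the phases only.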

Given $\boldsymbol{\delta}$ and $\boldsymbol{\phi}$ (i.e., given $\mathbf{v}_1$), it is not hard to find that the vector $[\mathbf{h}_2\mathbf{v}_1^H, \ldots, \mathbf{h}_2\mathbf{v}_N^H]$ follows the joint complex Gaussian distribution $\mathcal{CN}(\mathbf{0}, \mathbf{V}\mathbf{R}_2\mathbf{V}^H)$. Note that $q_n = |\mathbf{h}_2\mathbf{v}_n^H|^2$, $n = 1, \ldots, N$. Then, by following the work in [29], we can get the conditional joint pdf of $\mathbf{q}$ given $\boldsymbol{\delta}$ and $\boldsymbol{\phi}$ as

where the operator $(x)_r = x(x+1)\cdots(x+r-1)$, the function $L(x)$ satisfies

$[L(x)]^m [L(x)]^n = [L(x)]^{m+n}$ and $[L(x)]^m = L_m(x)$,    (26)

and $L_m(x)$ is the Laguerre polynomial of degree $m$ [30, (8.970)]. $C_{n_1,\ldots,n_N}$ is the Taylor



expansion coefficient of g{0) = | 

Cni,.,njv = (ni!...nAr! 
and $a_{ij}$ and $b_{ij}$ respectively denote the elements at the $i$th row and $j$th column of $\Re\{\mathbf{V}\mathbf{R}_2\mathbf{V}^H\}$ and $\Im\{\mathbf{V}\mathbf{R}_2\mathbf{V}^H\}$ for $i, j \in \{1, \ldots, N\}$.
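For readers who wish to evaluate these building blocks numerically, the following Python sketch implements the rising factorial $(x)_r = x(x+1)\cdots(x+r-1)$ and the Laguerre polynomial through its standard explicit sum. The function names are illustrative, and the sketch does not attempt to reproduce the conditional pdf itself.

```python
from math import comb, factorial


def rising_factorial(x, r):
    """(x)_r = x (x + 1) ... (x + r - 1), with (x)_0 = 1."""
    result = 1
    for k in range(r):
        result *= x + k
    return result


def laguerre(m, x):
    """L_m(x) = sum_{k=0}^{m} C(m, k) (-x)^k / k!."""
    return sum(comb(m, k) * (-x) ** k / factorial(k) for k in range(m + 1))


pochhammer_example = rising_factorial(3, 4)   # 3 * 4 * 5 * 6
laguerre_example = laguerre(2, 2.0)           # 1 - 2x + x^2 / 2 at x = 2
```

Note that the relation $[L(x)]^m = L_m(x)$ is a symbolic convention: a power of $L(x)$ is to be replaced by the Laguerre polynomial of the corresponding degree after expansion, rather than evaluated as a numerical power.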
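Based on (24) and (25), we can obtain the joint pdf of $\mathbf{q}$ as

$f_{\mathbf{q}}(\mathbf{x}) = \int_{0 \le z_1, \ldots, z_N \le 2\pi} \int_{y_1 + \cdots + y_{N-1} \le 1} f_{\mathbf{q}|\boldsymbol{\delta},\boldsymbol{\phi}}(\mathbf{x})\, f_{\boldsymbol{\delta},\boldsymbol{\phi}}(\mathbf{y}, \mathbf{z})\, d\mathbf{y}\, d\mathbf{z}$.    (28)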

It should be noted that the values of $\mathbf{q}$ are not completely determined by $\mathbf{v}_1$, since the standard orthogonal basis $\mathbf{V}$ generated from $\mathbf{v}_1$ is not unique. However, different $\mathbf{V}$ will lead to the same angle, since $\cos^2\theta = q_1/\|\mathbf{h}_2\|^2$. Therefore, to obtain the joint pdf of $\mathbf{q}$ from (28), we only need to consider a specific $\mathbf{V}$ generated from $\mathbf{v}_1$, e.g., by means of the Gram-Schmidt process.
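This invariance can be checked numerically. The following Python sketch, which assumes row-vector channels and uses hypothetical example vectors and helper names, builds two different orthonormal bases that share the same first vector via the Gram-Schmidt process and compares the resulting projections $q_n = |\mathbf{h}_2\mathbf{v}_n^H|^2$.

```python
def inner(a, b):
    """a b^H for complex row vectors given as lists."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def gram_schmidt(vectors):
    """Orthonormalise row vectors in order; the first direction is kept."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            coeff = inner(w, u)  # component of w along the unit vector u
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        norm = abs(inner(w, w)) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis


def projections(h2, V):
    """q_n = |h2 v_n^H|^2 for each basis vector v_n."""
    return [abs(inner(h2, v)) ** 2 for v in V]


h1 = [1 + 1j, 2 + 0j, 0.5j]      # hypothetical channel of user 1
h2 = [0.5 + 0j, 1 - 1j, 2 + 0j]  # hypothetical channel of user 2
e1, e2, e3 = [1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j], [0j, 0j, 1 + 0j]

V_a = gram_schmidt([h1, e1, e2])  # two completions of the same v_1
V_b = gram_schmidt([h1, e3, e2])
q_a, q_b = projections(h2, V_a), projections(h2, V_b)
cos2_theta = q_a[0] / abs(inner(h2, h2))
```

Both bases give the same $q_1$, and hence the same $\cos^2\theta$, while the remaining entries of $\mathbf{q}$ generally differ; in both cases the entries sum to $\|\mathbf{h}_2\|^2$.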
TABLE I
Comparison of feedback overhead and codebook size
Fig. 1. Network layout of centralized CoMP systems.
Fig. 2. The considered two-cell CoMP system for numerical analysis.
Fig. 3. The pdf of $\cos\theta$ with respect to different user locations.
Fig. 4. Normalized gap between $\nu_{\mathcal{S}_l, i_{km}}$ and its lower bound with respect to the position of user $i_{km}$. To simplify the notation, a single symbol is used here to denote the lower bounds of $\nu_{\mathcal{S}_l, i_{km}}$ for the NUS, the LocalNUS and the LUS. The results are averaged over 10000 small-scale Rayleigh fading channels consisting of independent and identically distributed (i.i.d.) complex Gaussian variables with zero mean and unit variance.
Fig. 5. Network layout for simulation; a cooperative cluster consisting of 3 coordinated BSs surrounded by 9 interfering cells.
Fig. 6. Cell-average and cell-edge throughput of the NUS, the LocalNUS and the LUS as a function of $\epsilon$ with perfect CSI for precoding.
Fig. 7. The CDF of the throughput achieved by each individual user assuming perfect CSI for precoding. The thresholds for the SUS, the LocalNUS, the NUS, and the LUS are chosen as 0.5, 0.4, 0.8 and 0.8, respectively, which are also used in Fig. 8, Fig. 9, Fig. 10 and Fig. 11.
Fig. 8. The CDF of the throughput achieved by each individual user considering limited feedback. The angular spread is set to be 15 degrees.
Fig. 9. The CDF of the throughput achieved by each individual user considering limited feedback. The angular spread is set to be 35 degrees.
Fig. 10. The CDF of the throughput achieved by each individual user considering limited feedback. The angular spread is set to be 360 degrees.
Fig. 11. The CDF of the throughput achieved by each individual user in time-varying channels. The operation frequency is 2 GHz. The results of the LUS for $v = 0$ km/h and $v = 30$ km/h overlap with the result of the NUS for $v = 30$ km/h.



